90 research outputs found

    The density connectivity information bottleneck

    Full text link
    Clustering with the agglomerative Information Bottleneck (aIB) algorithm suffers from sub-optimality: it cannot guarantee to preserve as much relative information as possible. To handle this problem, we introduce a density connectivity chain, by which we consider not only the information between two data elements but also the information among the neighbors of a data element. Based on this idea, we propose DCIB, a Density Connectivity Information Bottleneck algorithm that applies the Information Bottleneck method to quantify the relative information during the clustering procedure. As a hierarchical algorithm, DCIB produces a pruned clustering tree structure and yields clustering results of different sizes in a single execution. Experimental results on document clustering indicate that the DCIB algorithm preserves more relative information and achieves higher precision than the aIB algorithm.
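    As background for the merge criterion that aIB-style methods build on, the following is a minimal Python sketch of the standard agglomerative Information Bottleneck merge cost (the relative-information loss incurred by merging two clusters). It is illustrative only; the function and variable names are hypothetical, and it does not implement the density connectivity chain that DCIB adds on top of this cost.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence between two discrete distributions."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def merge_cost(p_ci, p_cj, py_ci, py_cj):
    """Information lost by merging clusters c_i and c_j (standard aIB cost).

    p_ci, p_cj   : cluster priors p(c_i), p(c_j)
    py_ci, py_cj : conditionals p(y|c_i), p(y|c_j) over the relevance variable Y
    The cost is (p(c_i)+p(c_j)) times the Jensen-Shannon divergence between the
    two conditionals, weighted by the clusters' relative masses.
    """
    w = p_ci + p_cj
    pi_i, pi_j = p_ci / w, p_cj / w
    py_merged = pi_i * py_ci + pi_j * py_cj          # p(y | merged cluster)
    js = pi_i * kl(py_ci, py_merged) + pi_j * kl(py_cj, py_merged)
    return w * js

def aib_step(priors, conditionals):
    """Greedy agglomerative step: find the cheapest pair of clusters to merge."""
    n = len(priors)
    best = (np.inf, None)
    for i in range(n):
        for j in range(i + 1, n):
            cost = merge_cost(priors[i], priors[j], conditionals[i], conditionals[j])
            if cost < best[0]:
                best = (cost, (i, j))
    return best
```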

    Intelligent techniques for recommender systems

    Full text link
    This thesis focuses on the data sparsity issue and the temporal dynamics issue in the context of collaborative filtering, and addresses them with imputation techniques, low-rank subspace techniques, and optimization techniques from the machine learning perspective. A comprehensive survey on the development of collaborative filtering techniques is also included.
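    As background for the low-rank subspace idea mentioned above, here is a minimal sketch of a plain matrix-factorization baseline for collaborative filtering: observed ratings are fit by low-rank user and item factors, and missing entries are imputed from the same factors. The function name and hyperparameters are hypothetical and are not the specific techniques developed in the thesis.

```python
import numpy as np

def factorize(ratings, rank=10, lr=0.01, reg=0.1, epochs=50, seed=0):
    """Low-rank matrix-factorization baseline for collaborative filtering.

    ratings : list of (user, item, value) triples for observed entries only.
    Learns user/item factors U, V so that U[u] @ V[i] approximates r_ui;
    unobserved entries are then imputed by the same dot product.
    """
    rng = np.random.default_rng(seed)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_items = 1 + max(i for _, i, _ in ratings)
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]            # prediction error on one observed rating
            u_old = U[u].copy()
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

# Usage: impute an unobserved rating from the learned factors
# U, V = factorize([(0, 1, 4.0), (0, 2, 3.0), (1, 1, 5.0)])
# predicted = U[0] @ V[2]
```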

    Intensity-free Integral-based Learning of Marked Temporal Point Processes

    Full text link
    In marked temporal point processes (MTPPs), a core problem is to parameterize the conditional joint PDF (probability density function) p^*(m,t) for inter-event time t and mark m, conditioned on the history. The majority of existing studies predefine intensity functions. Their utility is challenged by the need to specify a proper form for the intensity function, which is critical to balance expressiveness and processing efficiency. Recently, studies have moved away from predefining the intensity function -- one line models p^*(t) and p^*(m) separately, while the other focuses on temporal point processes (TPPs), which do not consider marks. This study aims to develop a high-fidelity p^*(m,t) for discrete events where the event marks are either categorical or numeric in a multi-dimensional continuous space. We propose a solution framework, IFIB (Intensity-Free Integral-Based process), that models the conditional joint PDF p^*(m,t) directly without intensity functions. This remarkably simplifies the process of enforcing the essential mathematical restrictions. We show the desired properties of IFIB and its superior experimental results on real-world and synthetic datasets. The code is available at https://github.com/StepinSilence/IFIB
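    To illustrate the integral-based idea (model a cumulative function directly and recover the density by differentiation, so non-negativity and normalization hold by construction), here is a minimal PyTorch sketch for the purely temporal part p^*(t) only. It is an assumption-laden toy, not the IFIB architecture: the class name, layer sizes, and monotone parameterization are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntegralBasedTime(nn.Module):
    """Toy intensity-free model of the conditional CDF F*(t|h) for inter-event time.

    F*(t|h) = 1 - exp(-m(t,h)), where m is non-negative, increasing in t,
    m(0,h) = 0, and grows linearly for large t so that F* -> 1.  The density
    p*(t|h) = dF*/dt is obtained with autograd, so it integrates to one by
    construction and no intensity function is ever specified.
    """
    def __init__(self, hist_dim=16, hidden=32):
        super().__init__()
        self.w_t = nn.Parameter(torch.randn(hidden))   # slope parameters for t
        self.w_h = nn.Linear(hist_dim, hidden)         # history conditioning
        self.v = nn.Parameter(torch.randn(hidden))     # output mixing weights
        self.a = nn.Parameter(torch.tensor(0.0))       # linear tail slope

    def m(self, t, h):
        slope = F.softplus(self.w_t)                   # positive => increasing in t
        bounded = torch.tanh(slope * t + self.w_h(h)) - torch.tanh(self.w_h(h))
        return (F.softplus(self.v) * bounded).sum(-1, keepdim=True) + F.softplus(self.a) * t

    def log_density(self, t, h):
        t = t.requires_grad_(True)
        cdf = 1.0 - torch.exp(-self.m(t, h))
        pdf, = torch.autograd.grad(cdf.sum(), t, create_graph=True)
        return torch.log(pdf + 1e-12)

# Usage: negative log-likelihood of observed inter-event times
# model = IntegralBasedTime()
# nll = -model.log_density(torch.rand(4, 1), torch.randn(4, 16)).mean()
```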

    Mining Medical Data: Bridging the Knowledge Divide

    Get PDF
    Due to the significant amount of data generated by modern medicine, there is a growing reliance on tools such as data mining and knowledge discovery to help make sense of and comprehend such data. The success of this process requires collaboration and interaction between such methods and medical professionals. Therefore an important question is: how can we strengthen the relationship between two traditionally separate fields (technology and medicine) so that they work simultaneously towards enhancing knowledge in modern medicine? To address this question, this study examines the application of data mining techniques to a large asthma medical dataset. A discussion is presented that introduces various methods for a smooth approach, moving away from a 'jack of all trades, master of none' stance towards a modular, cooperative approach for a successful outcome. The results of this study support the use of data mining as a useful tool and highlight the advantages, on a global scale, of closer relations between the two distinct fields. The exploration of the CRISP methodology suggests that a 'one methodology fits all' approach is not appropriate; rather, elements combine to create a hybrid, holistic approach to data mining.

    G-CREWE: Graph CompREssion With Embedding for Network Alignment

    Full text link
    Network alignment is useful for multiple applications that require increasingly large graphs to be processed. Existing research approaches this as an optimization problem or computes the similarity based on node representations. However, the process of aligning every pair of nodes between relatively large networks is time-consuming and resource-intensive. In this paper, we propose a framework, called G-CREWE (Graph CompREssion With Embedding), to solve the network alignment problem. G-CREWE uses node embeddings to align the networks on two levels of resolution, a fine resolution given by the original network and a coarse resolution given by a compressed version, to achieve an efficient and effective network alignment. The framework first extracts node features and learns the node embedding via a Graph Convolutional Network (GCN). Then, the node embedding helps to guide the process of graph compression and finally improves the alignment performance. As part of G-CREWE, we also propose a new compression mechanism called MERGE (Minimum dEgRee neiGhbors comprEssion) to reduce the size of the input networks while preserving the consistency of their topological structure. Experiments on real networks show that our method is more than twice as fast as the most competitive existing methods while maintaining high accuracy.
    Comment: 10 pages, accepted at the 29th ACM International Conference on Information and Knowledge Management (CIKM '20)
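    The compression idea can be illustrated with a small, generic sketch: absorb low-degree nodes into a neighboring super-node so the graph to be aligned shrinks while its coarse topology is kept, and retain a mapping back to the original nodes. This is a hypothetical stand-in, not the exact MERGE rule or the GCN-based embedding step of G-CREWE.

```python
from collections import defaultdict

def compress_by_min_degree(adj, min_degree=2):
    """Illustrative compression that merges low-degree nodes into a neighbor.

    adj : dict[node, set[node]] undirected adjacency.
    Returns the compressed adjacency over super-nodes and the mapping from
    each original node to its super-node, so results computed on the small
    graph can be expanded back to the original one.
    """
    owner = {v: v for v in adj}                       # super-node each node belongs to
    for v in sorted(adj, key=lambda x: len(adj[x])):  # lowest-degree nodes first
        if adj[v] and len(adj[v]) < min_degree and owner[v] == v:
            target = max(adj[v], key=lambda u: len(adj[u]))
            owner[v] = owner[target]                  # absorb v into its strongest neighbor

    def root(v):                                      # flatten ownership chains
        while owner[v] != v:
            v = owner[v]
        return v
    owner = {v: root(v) for v in adj}

    small = defaultdict(set)                          # compressed adjacency over super-nodes
    for v, nbrs in adj.items():
        for u in nbrs:
            if owner[v] != owner[u]:
                small[owner[v]].add(owner[u])
                small[owner[u]].add(owner[v])
    return dict(small), owner
```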

    A Radiation Viewpoint of Reconfigurable Reflectarray Elements: Performance Limit, Evaluation Criterion and Design Process

    Full text link
    Reconfigurable reflectarray antennas (RRAs) have developed rapidly, with various prototypes proposed in the recent literature. However, designing wideband, multiband, or high-frequency RRAs faces great challenges, especially lengthy simulation times due to the lack of systematic design guidance. The current scattering viewpoint of the RRA element, which couples antenna structures and switches during the design process, fails to address these issues. Here, we propose a novel radiation viewpoint to model, evaluate, and design RRA elements. Under this viewpoint, the design goal is to match the element impedance to a characteristic impedance pre-calculated from the switch parameters, allowing the various impedance-matching techniques developed for classical antennas to be applied to RRA element design. Furthermore, the theoretical performance limit can be pre-determined for given switch parameters before designing specific structures, and the constant-loss curve is suggested as an intuitive tool for evaluating element performance on the Smith chart. The proposed method is validated with a practical 1-bit RRA element with degraded switch parameters. Then, a 1-bit RRA element with wideband performance is successfully designed using the proposed design process. The proposed method provides a novel perspective on RRA elements and offers systematic, effective guidance for designing wideband, multiband, and high-frequency RRAs.
    Comment: Accepted by IEEE Transactions on Antennas and Propagation
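    As a rough illustration of the impedance-based quantities involved (not the paper's radiation-viewpoint formulation), the sketch below evaluates the reflection loss and phase of the two states of a 1-bit element modeled as two switch-dependent load impedances seen against a reference impedance; the impedance values and function names are hypothetical.

```python
import numpy as np

def reflection(z_load, z0=50.0):
    """Reflection coefficient of a load impedance against a reference impedance."""
    return (z_load - z0) / (z_load + z0)

def one_bit_states(z_on, z_off, z0=50.0):
    """Loss (dB) and reflection phase (deg) for the two states of a 1-bit element,
    treated as two switch-dependent load impedances on a single port.
    Generic transmission-line illustration; z_on/z_off are hypothetical values."""
    out = {}
    for name, z in (("on", z_on), ("off", z_off)):
        g = reflection(z, z0)
        out[name] = {"loss_dB": -20 * np.log10(abs(g)),
                     "phase_deg": float(np.degrees(np.angle(g)))}
    return out

# Example with two lossy switch states; an ideal 1-bit element would show
# low loss in both states and ~180 degrees of phase difference between them.
# print(one_bit_states(z_on=5 + 60j, z_off=4 - 55j))
```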